Abstract: Time series modeling is an active research topic in statistics. On November 11, 2015, the turnover of the Tmall
platform exceeded 91.2 billion yuan, which attracted the attention of scholars at home and abroad. This paper therefore aims
to forecast Tmall sales, which is helpful to enterprises. The research methods are the ARIMA model and the VAR model. The
former is a single-variable model and the latter is a multi-variable model. In the study, the ARIMA model makes the sequence
stationary through two differencing operations. In the VAR model, five explanatory variables are transformed into one
principal component. In comparison, the VAR model does not give accurate detailed predictions, but the ARIMA model does.
Therefore, the single-variable time series model is more suitable for sales forecasting than the multi-variable model.

I. Introduction

In recent years, the prosperity of electronic commerce has made enterprises pay more and more attention to marketing
strategy, especially accurate sales forecasting. A time series model uses historical data on past behavior to infer the
future behavior of a time sequence. This paper therefore presents an empirical analysis of sales forecasting using time
series models. The industry data collected are sales, price, online active stores, online stores clinching a deal, online
products, and products clinching a deal. MATLAB and EViews are used to process the data.

The body of this paper has four parts. The first part is the introduction. The second part defines the ARIMA model and the
VAR model. The third part covers modeling and forecasting sales. The last part is the conclusion.

II. Model Definition 

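Generally, the AR(p) model is as follows:

$$x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + u_t \qquad (1)$$

If the error term $u_t$ is a white noise sequence, the process is described as pure AR(p). Otherwise, $u_t$ is a pure MA(q)
process as follows:

$$u_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \cdots - \theta_q \varepsilon_{t-q} \qquad (2)$$

The ARMA model [1] has the following structure:

$$x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \cdots - \theta_q \varepsilon_{t-q} \qquad (3)$$

The autoregressive integrated moving average model, ARIMA for short, has the following structure:

$$\begin{cases}
\Phi(B)\nabla^d x_t = \Theta(B)\varepsilon_t \\
E(\varepsilon_t) = 0,\ \mathrm{Var}(\varepsilon_t) = \sigma_\varepsilon^2,\ E(\varepsilon_t \varepsilon_s) = 0,\ s \neq t \\
E(x_s \varepsilon_t) = 0,\ \forall s < t
\end{cases} \qquad (4)$$

Here $B$ is the backshift operator, $\nabla^d = (1 - B)^d$, $\Phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ and
$\Theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$. The basic idea is to make a non-stationary time series stationary by
differencing, so that an ARMA model can be fitted to the resulting stationary series.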


The vector autoregressive model [2], VAR for short, has the following structure:

$$y_t = \delta + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + u_t \qquad (5)$$

$A_1, \ldots, A_p$ and $\Sigma_u$ are $m \times m$ matrices; $u_t$ and $\delta$ are $m$-dimensional vectors. Let
$E(u_t) = 0$, $E(u_t u_t^T) = \Sigma_u$, and $E(u_t u_s^T) = 0$ for $s \neq t$.

Both of the above models are stationary when the reciprocals of the characteristic roots lie inside the unit circle.
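
The stationarity condition can be checked numerically. The following is a minimal Python sketch for the AR(2) case only, where the reciprocals of the characteristic roots are the solutions of z^2 - phi1*z - phi2 = 0. The function names and the coefficient values are hypothetical and chosen purely for illustration; they are not the fitted values of this paper.

```python
import cmath


def ar2_inverse_roots(phi1, phi2):
    """Solve z^2 - phi1*z - phi2 = 0, the inverse roots of 1 - phi1*B - phi2*B^2."""
    disc = cmath.sqrt(phi1 * phi1 + 4.0 * phi2)
    return (phi1 + disc) / 2.0, (phi1 - disc) / 2.0


def is_stationary(phi1, phi2):
    """An AR(2) process is stationary when both inverse roots lie inside the unit circle."""
    return all(abs(root) < 1.0 for root in ar2_inverse_roots(phi1, phi2))


# Hypothetical coefficients, for illustration only.
print(is_stationary(0.5, 0.3))  # inverse roots inside the unit circle
print(is_stationary(0.9, 0.3))  # one inverse root outside the unit circle
```

For a higher-order AR or a VAR model, the same check would be applied to the eigenvalues of the companion matrix, which is what the inverse-root plots produced by econometric software display.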

III. Modeling and Forecast 

Monthly data covering the period from January 2010 to December 2014 are used. They were provided by an enterprise.

3.1 ARIMA modeling 

First, a pure randomness test is needed, because modeling only makes sense for a non-white-noise sequence. The LB [3] test
results are as follows.


Table 1
LB Test

Delayed order    LB statistic    P value
3                55.339          0
6                67.568          0
12               87.188          0


Table 1 shows that the P values are less than the significance level of 0.05. Therefore, the null hypothesis of pure
randomness is rejected, and the sequence is not a white noise sequence.
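
For readers who want to reproduce this kind of test, the LB statistic for m lags is Q = n(n+2) * sum over k = 1..m of r_k^2 / (n - k), where r_k is the lag-k sample autocorrelation, and Q is compared with a chi-square distribution with m degrees of freedom. The sketch below is a plain-Python illustration under that definition; the function names and the trending toy series are assumptions and are not the sales data of this paper.

```python
def autocorr(x, k):
    """Lag-k sample autocorrelation of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom


def ljung_box_q(x, max_lag):
    """Ljung-Box Q statistic using lags 1..max_lag."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorr(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# Hypothetical series with a strong trend: clearly not white noise.
toy = [float(t) for t in range(60)]
q3 = ljung_box_q(toy, 3)

# 7.815 is the 5% critical value of the chi-square distribution with 3 degrees of freedom.
print(q3 > 7.815)
```

A statistic above the critical value leads to rejecting the white noise hypothesis, which is the situation reported in Table 1.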

Next comes the stationarity test. The ADF test gives a P value of approximately 1, so the sequence is non-stationary. The
modeling process is as follows:

To eliminate seasonal fluctuations, a first-order difference with step 12 is applied. To eliminate the trend, a first-order
difference with step 1 is applied. The differenced sequence is stationary. Considering the sample size, p and q are each
varied from 0 to 3. The pair (p, q) at which the AIC function reaches its minimum value gives the model order [4].
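
The two differencing steps can be sketched in a few lines of Python. This is an illustration only: the helper name `difference` and the synthetic series (a linear trend plus a repeating 12-month pattern) are assumptions, not the data or code used in this paper.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: series[t] - series[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical 60-month series: linear trend plus a repeating yearly pattern.
pattern = [0, 2, 5, 3, 1, 0, 0, 1, 4, 6, 30, 8]
sales = [10.0 * t + pattern[t % 12] for t in range(60)]

deseasonalized = difference(sales, lag=12)      # removes the 12-month pattern
stationary = difference(deseasonalized, lag=1)  # removes the remaining trend

print(len(sales), len(deseasonalized), len(stationary))  # 60 48 47
```

Each differencing step shortens the series by its lag, so 60 monthly observations leave 47 points for fitting the ARMA model, which is one reason to keep p and q small.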


Table 2
AIC Function Value

p    q    AIC        p    q    AIC
-    -    -          2    0    43.5429
0    1    39.1627    2    1    45.5429
0    2    40.44      2    2    46.3551
0    3    41.6112    2    3    38.3967
1    0    51.4209    3    0    45.5425
1    1    40.2587    3    1    47.5215
1    2    42.1878    3    2    42.6222
1    3    43.4502    3    3    40.1274



Table 2 shows that the AIC is smallest at p = 2 and q = 3, so the model structure is identified as ARMA(2,3).
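
The order selection amounts to taking the minimum over the AIC values. The short Python sketch below uses the values as read from Table 2 (the table lists no value for p = 0, q = 0); the variable names are hypothetical.

```python
# AIC values keyed by (p, q), as read from Table 2.
aic_table = {
    (0, 1): 39.1627, (0, 2): 40.44, (0, 3): 41.6112,
    (1, 0): 51.4209, (1, 1): 40.2587, (1, 2): 42.1878, (1, 3): 43.4502,
    (2, 0): 43.5429, (2, 1): 45.5429, (2, 2): 46.3551, (2, 3): 38.3967,
    (3, 0): 45.5425, (3, 1): 47.5215, (3, 2): 42.6222, (3, 3): 40.1274,
}

# The order with the smallest AIC is selected.
best_order = min(aic_table, key=aic_table.get)
print(best_order, aic_table[best_order])  # (2, 3) 38.3967
```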

Next, the parameters are estimated and sales are forecast. The fitted model is as follows:

$$x_t = 0.9763 x_{t-1} + 0.7508 x_{t-2} - 0.0607 \varepsilon_{t-1} - 0.0085 \varepsilon_{t-2} - 0.9308 \varepsilon_{t-3} \qquad (6)$$


The forecast is shown in Fig. 1.

Fig.1 Sales forecast of 2014 using ARIMA model

The calculated average error rate is 0.2313, which is considered accurate for sales forecasting. The residual sequence is
white noise.

Fig.2 The residuals of ARMA model and its ACF and PACF

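The paper does not state how the average error rate is computed. One common definition, assumed here, is the mean absolute percentage error between actual and forecast values. The function name and the numbers below are hypothetical and serve only to show the calculation.

```python
def mean_absolute_percentage_error(actual, forecast):
    """Average of |actual - forecast| / |actual| over all periods."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical values: relative errors are 0.10, 0.10 and 0.25.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 300.0]
error_rate = mean_absolute_percentage_error(actual, forecast)
print(round(error_rate, 4))  # 0.15
```

Under this definition, an error rate of 0.2313 would mean that monthly forecasts deviate from actual sales by about 23% on average.
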
3.2 VAR modeling 

Five variables affect sales, so it is suitable to use principal component analysis, PCA [5] for short, to reduce the
dimension before establishing the VAR model.


Table 3
Results of PCA

Number    Principal Component    Contribution Rate    Cumulative Contribution Rate
1         z1                     0.9654               0.9654
2         z2                     0.0343               0.9996
3         z3                     0.0003               1.0000
4         z4                     0.0000               1.0000
5         z5                     0.0000               1.0000


The contribution rate of the first principal component is above 90%, so it alone is retained. Its formula is as follows:

$$z_1 = -0.0039 x_1 + 0.0288 x_2 + 0.0047 x_3 + 0.9559 x_4 + 0.2922 x_5 \qquad (7)$$

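Equation (7) is a weighted sum of the five explanatory variables. The Python sketch below evaluates it for one observation. The loadings are those of equation (7); the function name and the input values are hypothetical, and the paper does not say whether the variables were standardized before PCA, so that step is left out.

```python
# Loadings of the first principal component, from equation (7).
LOADINGS = [-0.0039, 0.0288, 0.0047, 0.9559, 0.2922]


def first_component(x):
    """Score of one observation x = [x1, ..., x5] on the first principal component."""
    if len(x) != len(LOADINGS):
        raise ValueError("expected five explanatory variables")
    return sum(w * v for w, v in zip(LOADINGS, x))


# Hypothetical observation in which only x4 is non-zero.
print(first_component([0.0, 0.0, 0.0, 1.0, 0.0]))  # 0.9559
```

The loadings show that z1 is dominated by x4 and, to a lesser extent, x5, while the other three variables contribute almost nothing.
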
The VAR model uses two variables, namely z1 and sales. First, the stationarity test shows that when p ≤ 6 the reciprocals
of the characteristic roots all lie inside the unit circle, so the model is stationary.

[Figure: two panels, each titled "Inverse Roots of AR Characteristic Polynomial"]
Fig.3 Distribution of the reciprocals of the characteristic roots for VAR(6) and VAR(7)




As with the ARIMA model, MATLAB is used to calculate the AIC function. The model structure is finally determined as
VAR(6). The estimated sales equation is given in equation (8):

$$\begin{aligned}
\text{sales}_t = {} & 0.56\,\text{sales}_{t-1} - 0.04\,\text{sales}_{t-2} - 0.58\,\text{sales}_{t-3} \\
& + 0.28\,\text{sales}_{t-4} + 0.22\,\text{sales}_{t-5} - 0.45\,\text{sales}_{t-6} \\
& + 0.85\,z_{1,t-1} - 0.32\,z_{1,t-2} - 0.24\,z_{1,t-3} \\
& + 0.09\,z_{1,t-4} - 0.05\,z_{1,t-5} + 0.16\,z_{1,t-6} \\
& + 19701.23
\end{aligned} \qquad (8)$$
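
Equation (8) can be turned into a one-step-ahead forecast function. The Python sketch below uses the coefficients of equation (8); the function name, the ordering of the lag lists, and the example inputs are assumptions made for illustration.

```python
# Coefficients of equation (8), ordered from lag 1 to lag 6.
SALES_COEF = [0.56, -0.04, -0.58, 0.28, 0.22, -0.45]
Z1_COEF = [0.85, -0.32, -0.24, 0.09, -0.05, 0.16]
INTERCEPT = 19701.23


def forecast_sales(sales_lags, z1_lags):
    """One-step forecast; index 0 of each list is the most recent value (t-1)."""
    if len(sales_lags) != 6 or len(z1_lags) != 6:
        raise ValueError("six lags of each variable are required")
    return (
        INTERCEPT
        + sum(c * v for c, v in zip(SALES_COEF, sales_lags))
        + sum(c * v for c, v in zip(Z1_COEF, z1_lags))
    )


# Hypothetical check: with all lags equal to zero only the intercept remains.
print(forecast_sales([0.0] * 6, [0.0] * 6))  # 19701.23
```

Only the sales equation of the bivariate VAR is reported, so forecasts more than one step ahead would also require the corresponding equation for z1.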



The VAR model can capture the change of the time series from a macroscopic perspective, but its precision is not high.

IV. Conclusion

As a single-variable model, the ARIMA model is established by calculating the ACF and PACF of the sequence and of its
residuals. According to the Akaike information criterion, the model structure is identified as ARMA(2,3). It is precise and
credible. The multi-variable time series model first applies principal component analysis to reduce the dimension: five
explanatory variables are transformed into one principal component, so the VAR model has two variables. In contrast, the
prediction accuracy of the VAR model is not high. It is better suited to macroeconomic analysis.

This paper studies sales forecasting, which belongs to the category of microeconomics. It concludes that a single-variable
time series model is more suitable for this problem, especially stochastic time series models.

Although an ideal time series model has been found, the sales forecast for November may still have a large error. This is a
problem of structural breaks, which needs further research.